 breast cancer patient


A Goemans-Williamson type algorithm for identifying subcohorts in clinical trials

Worah, Pratik

arXiv.org Artificial Intelligence

We design an efficient algorithm that outputs tests for identifying predominantly homogeneous subcohorts of patients from large inhomogeneous datasets. Our theoretical contribution is a rounding technique, similar to that of Goemans and Williamson (1995), that approximates the optimal solution within a factor of $0.82$. As an application, we use our algorithm to trade off sensitivity for specificity to systematically identify clinically interesting homogeneous subcohorts of patients in the RNA microarray dataset for breast cancer from Curtis et al. (2012). One such clinically interesting subcohort suggests a link between LXR over-expression and BRCA2 and MSH6 methylation levels for patients in that subcohort.
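For orientation, the classical random-hyperplane rounding of Goemans and Williamson can be sketched as below. This is not the paper's own rounding technique, which the abstract only describes as similar. The sketch assumes unit vectors have already been obtained from some semidefinite relaxation, and the function name `hyperplane_round` is hypothetical.

```python
import random


def hyperplane_round(vectors, rng):
    """Assign each vector a label in {+1, -1} by the side of a random hyperplane.

    `vectors` is assumed to hold the unit vectors of an SDP solution
    (an assumption for illustration; the SDP itself is not solved here).
    """
    dim = len(vectors[0])
    # A standard Gaussian vector gives a uniformly random hyperplane normal.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    labels = []
    for v in vectors:
        dot = sum(a * b for a, b in zip(v, normal))
        labels.append(1 if dot >= 0 else -1)
    return labels


if __name__ == "__main__":
    vecs = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0)]
    print(hyperplane_round(vecs, random.Random(0)))
```

Vectors that point in nearly the same direction tend to receive the same label, while nearly opposite vectors are usually separated, which is the property the rounding analysis relies on.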


Can Artificial Intelligence Generate Quality Research Topics Reflecting Patient Concerns?

Kim, Jiyeong, Chen, Michael L., Rezaei, Shawheen J., Ramirez-Posada, Mariana, Caswell-Jin, Jennifer L., Kurian, Allison W., Riaz, Fauzia, Sarin, Kavita Y., Tang, Jean Y., Asch, Steven M., Linos, Eleni

arXiv.org Artificial Intelligence

Patient-centered research is increasingly important in narrowing the gap between research and patient care, yet incorporating patient perspectives into health research has been inconsistent. We propose an automated framework leveraging innovative natural language processing (NLP) and artificial intelligence (AI) with patient portal messages to generate research ideas that prioritize important patient issues. We further quantified the quality of AI-generated research topics. To define patient clinical concerns, we analyzed 614,464 patient messages from 25,549 individuals with breast or skin cancer obtained from a large academic hospital (2013 to 2024), constructing a two-stage unsupervised NLP topic model. Then, we generated research topics to resolve the defined issues using a widely used AI (ChatGPT-4o, OpenAI Inc, April 2024 version) with prompt-engineering strategies. We guided the AI to perform multi-level tasks: 1) knowledge interpretation and summarization (e.g., interpreting and summarizing the NLP-defined topics), 2) knowledge generation (e.g., generating research ideas corresponding to patients' issues), 3) self-reflection and correction (e.g., ensuring and revising the research ideas after searching for scientific articles), and 4) self-reassurance (e.g., confirming and finalizing the research ideas). Six highly experienced breast oncologists and dermatologists assessed the significance and novelty of AI-generated research topics using a 5-point Likert scale (1-exceptional, 5-poor). One-third of the AI-suggested research topics were highly significant and novel when both scores were lower than the average. Two-thirds of the AI-suggested topics were novel in both cancers. Our findings demonstrate that AI-generated research topics reflecting patient perspectives via a large volume of patient messages can meaningfully guide future directions in patient-centered health research.


Predicting Breast Cancer Survival: A Survival Analysis Approach Using Log Odds and Clinical Variables

Alamu, Opeyemi Sheu, Choque, Bismar Jorge Gutierrez, Rizvi, Syed Wajeeh Abbs, Hammed, Samah Badr, Medani, Isameldin Elamin, Siam, Md Kamrul, Tahir, Waqar Ahmad

arXiv.org Artificial Intelligence

Breast cancer remains a significant global health challenge, with prognosis and treatment decisions largely dependent on clinical characteristics. Accurate prediction of patient outcomes is crucial for personalized treatment strategies. This study employs survival analysis techniques, including Cox proportional hazards and parametric survival models, to enhance the prediction of the log odds of survival in breast cancer patients. Clinical variables such as tumor size, hormone receptor status, HER2 status, age, and treatment history were analyzed to assess their impact on survival outcomes. Data from 1557 breast cancer patients were obtained from a publicly available dataset provided by the University College Hospital, Ibadan, Nigeria. This dataset was preprocessed and analyzed using both univariate and multivariate approaches to evaluate survival outcomes. Kaplan-Meier survival curves were generated to visualize survival probabilities, while the Cox proportional hazards model identified key risk factors influencing mortality. The results showed that older age, larger tumor size, and HER2-positive status were significantly associated with an increased risk of mortality. In contrast, estrogen receptor positivity and breast-conserving surgery were linked to better survival outcomes. The findings suggest that integrating these clinical variables into predictive models improves the accuracy of survival predictions, helping to identify high-risk patients who may benefit from more aggressive interventions. This study demonstrates the potential of survival analysis in optimizing breast cancer care, particularly in resource-limited settings. Future research should focus on integrating genomic data and real-world clinical outcomes to further refine these models.
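As a rough illustration of the Kaplan-Meier curves mentioned in this abstract, the product-limit estimator can be sketched in plain Python. This is a minimal sketch, not the study's code; the function name `kaplan_meier` and the toy data are hypothetical, and events are assumed to be coded 1 for an observed death and 0 for a censored observation.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct death time.

    durations: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if censored (assumed coding).
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    # Four hypothetical patients; the second one is censored at time 2.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients lower the number at risk without producing a drop in the curve, which is why the estimate changes only at observed death times.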


Some breast cancer patients could be at risk of another type of cancer, study reveals

FOX News

Women with breast cancer who have received chemotherapy are at an increased risk of developing lung cancer, a new study suggests. Epic Research, a health data group based in Delaware, found that women in this category have a 57% higher lung cancer risk than those who received radiation. In comparison to patients who received endocrine therapy, those who have undergone chemo have a 171% increase in lung cancer risk, the study found. In a statement sent to Fox News Digital, the Epic Research team said the key takeaway from their research is that primary lung cancer is more than twice as prevalent in women who were previously diagnosed with breast cancer -- compared to those who did not have it.


Survival Analysis of Young Triple-Negative Breast Cancer Patients

O, M. Mehdi Owrang, Horestani, Fariba Jafari, Schwarz, Ginger

arXiv.org Artificial Intelligence

Breast cancer prognosis is crucial for effective treatment. The disease is more common in women over 40 years old and rare in women under 40, who account for less than 5 percent of cases in the U.S. Studies indicate a worse prognosis in younger women, which varies by ethnicity. Breast cancers are classified based on receptors like estrogen, progesterone, and HER2. Triple-negative breast cancer (TNBC), lacking these receptors, accounts for about 15 percent of cases and is more prevalent in younger patients, often resulting in poorer outcomes. Nevertheless, the impact of age on TNBC prognosis remains unclear. Factors like age, race, tumor grade, size, and lymph node status are studied for their role in TNBC's clinical outcomes, but current research is inconclusive about age-related differences. This study uses the SEER dataset to examine the influence of younger age on survivability in TNBC patients, aiming to determine if age is a significant prognostic factor. Our experimental results on the SEER dataset confirm the existing research reports that TNBC patients have worse prognosis compared to non-TNBC based on age. Our main goal was to investigate whether younger age has any significance on the survivability of TNBC patients. Experimental results do not show that younger age has any significance on the prognosis and survival rate of the TNBC patients.


Revealing the impact of social circumstances on the selection of cancer therapy through natural language processing of social work notes

Sun, Shenghuan, Zack, Travis, Williams, Christopher Y. K., Butte, Atul J., Sushil, Madhumita

arXiv.org Artificial Intelligence

We aimed to investigate the impact of social circumstances on cancer therapy selection using natural language processing to derive insights from social worker documentation. We developed and employed a Bidirectional Encoder Representations from Transformers (BERT) based approach, using a hierarchical multi-step BERT model (BERT-MS) to predict the prescription of targeted cancer therapy to patients based solely on documentation by clinical social workers. Our corpus included free-text clinical social work notes, combined with medication prescription information, for all patients treated for breast cancer. We conducted a feature importance analysis to pinpoint the specific social circumstances that impact cancer therapy selection. Using only social work notes, we consistently predicted the administration of targeted therapies, suggesting systematic differences in treatment selection exist due to non-clinical factors. The UCSF-BERT model, pretrained on clinical text at UCSF, outperformed other publicly available language models with an AUROC of 0.675 and a Macro F1 score of 0.599. The UCSF BERT-MS model, capable of leveraging multiple pieces of notes, surpassed the UCSF-BERT model in both AUROC and Macro-F1. Our feature importance analysis identified several clinically intuitive social determinants of health (SDOH) that potentially contribute to disparities in treatment. Our findings indicate that significant disparities exist among breast cancer patients receiving different types of therapies based on social determinants of health. Social work reports play a crucial role in understanding these disparities in clinical decision-making.
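The Macro F1 score reported above is the unweighted mean of the per-class F1 scores, so a minority class counts as much as the majority class. A minimal sketch of the metric follows; the function name `macro_f1` and the toy labels are hypothetical and unrelated to the study's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels: 1 = targeted therapy prescribed, 0 = not prescribed.
    print(macro_f1([1, 1, 0, 0], [1, 0, 0, 0]))
```

In the toy example the per-class F1 scores are 0.8 for class 0 and 2/3 for class 1, and the macro score is their plain average.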


Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data

Farooq, Mugariya, Hardan, Shahad, Zhumbhayeva, Aigerim, Zheng, Yujia, Nakov, Preslav, Zhang, Kun

arXiv.org Artificial Intelligence

The need for more usable and explainable machine learning models in healthcare increases the importance of developing and utilizing causal discovery algorithms, which aim to discover causal relations by analyzing observational data. Explainable approaches aid clinicians and biologists in predicting the prognosis of diseases and suggesting proper treatments. However, very little research has been conducted at the crossroads between causal discovery, genomics, and breast cancer, and we aim to bridge this gap. Moreover, evaluation of causal discovery methods on real data is in general notoriously difficult because ground-truth causal relations are usually unknown, and accordingly, in this paper, we also propose to address the evaluation problem with large language models. In particular, we exploit suitable causal discovery algorithms to investigate how various perturbations in the genome can affect the survival of patients diagnosed with breast cancer. We used three main causal discovery algorithms: PC, Greedy Equivalence Search (GES), and a Generalized Precision Matrix-based one. We experiment with a subset of The Cancer Genome Atlas, which contains information about mutations, copy number variations, protein levels, and gene expressions for 705 breast cancer patients. Our findings reveal important factors related to the vital status of patients using causal discovery algorithms. However, the reliability of these results remains a concern in the medical domain. Accordingly, as another contribution of the work, the results are validated through language models trained on biomedical literature, such as BlueBERT and other large language models trained on medical corpora. Our results support the use of causal discovery algorithms together with language models for revealing reliable causal relations for clinical applications.


Learning interpretable causal networks from very large datasets, application to 400,000 medical records of breast cancer patients

Ribeiro-Dantas, Marcel da Câmara, Li, Honghao, Cabeli, Vincent, Dupuis, Louise, Simon, Franck, Hettal, Liza, Hamy, Anne-Sophie, Isambert, Hervé

arXiv.org Artificial Intelligence

Discovering causal effects is at the core of scientific investigation but remains challenging when only observational data is available. In practice, causal networks are difficult to learn and interpret, and limited to relatively small datasets. We report a more reliable and scalable causal discovery method (iMIIC), based on a general mutual information supremum principle, which greatly improves the precision of inferred causal relations while distinguishing genuine causes from putative and latent causal effects. We showcase iMIIC on synthetic and real-life healthcare data from 396,179 breast cancer patients from the US Surveillance, Epidemiology, and End Results program. More than 90% of predicted causal effects appear correct, while the remaining unexpected direct and indirect causal effects can be interpreted in terms of diagnostic procedures, therapeutic timing, patient preference or socio-economic disparity. iMIIC's unique capabilities open up new avenues to discover reliable and interpretable causal networks across a range of research fields.


AI-based tool set to improve breast cancer diagnosis

#artificialintelligence

An AI-based tool that improves breast cancer diagnosis and predicts the risk of recurrence has been developed by researchers in Sweden. The advance from a team at the Karolinska Institutet could lead to more personalised treatment for breast cancer patients with intermediate risk tumours. The results are published in Annals of Oncology. In the diagnostic procedure for breast cancer, tissue samples of the tumour are analysed and graded by a pathologist and categorised by risk as low (grade 1), medium (grade 2) or high (grade 3), which guides decisions on the most suitable treatment. "Roughly half of breast cancer patients have a grade 2 tumour, which unfortunately gives no clear guidance on how the patient is to be treated," said Yinxi Wang, a doctoral student at the Department of Medical Epidemiology and Biostatistics, Karolinska Institutet.


Explainable Artificial Intelligence Reveals Novel Insight into Tumor Microenvironment Conditions Linked with Better Prognosis in Patients with Breast Cancer

Chakraborty, Debaditya, Ivan, Cristina, Amero, Paola, Khan, Maliha, Rodriguez-Aguayo, Cristian, Başağaoğlu, Hakan, Lopez-Berestein, Gabriel

arXiv.org Artificial Intelligence

We investigated the data-driven relationship between features in the tumor microenvironment (TME) and the overall and 5-year survival in triple-negative breast cancer (TNBC) and non-TNBC (NTNBC) patients by using Explainable Artificial Intelligence (XAI) models. We used clinical information from patients with invasive breast carcinoma from The Cancer Genome Atlas and from two studies from the cBioPortal, the PanCanAtlas project and the GDAC Firehose study. In this study, we used a normalized RNA sequencing data-driven cohort from 1,015 breast cancer patients, alive or deceased, from the UCSC Xena data set and performed integrated deconvolution with the EPIC method to estimate the percentage of seven different immune and stromal cells from RNA sequencing data. Novel insights derived from our XAI model showed that CD4+ T cells and B cells are more critical than other TME features for enhanced prognosis for both TNBC and NTNBC patients. Our XAI model revealed the critical inflection points (i.e., threshold fractions) of CD4+ T cells and B cells above or below which 5-year survival rates improve. Subsequently, we ascertained the conditional probabilities of $\geq$ 5-year survival in both TNBC and NTNBC patients under specific conditions inferred from the inflection points. In particular, the XAI models revealed that a B-cell fraction exceeding 0.018 in the TME could ensure 100% 5-year survival for NTNBC patients. The findings from this research could lead to more accurate clinical predictions and enhanced immunotherapies and to the design of innovative strategies to reprogram the TME of breast cancer patients.